
Nature Methods

Springer Science and Business Media LLC

Preprints posted in the last 30 days, ranked by how well they match the content profile of Nature Methods, based on 336 papers previously published here. The average preprint has a 0.37% match score for this journal, so anything above that is an above-average fit.

1
LLM-autonomous development of deep learning models for quantitative microscopy

Zhou, X.; Wang, S.

2026-04-08 bioengineering 10.64898/2026.04.03.716415 medRxiv
Top 0.1%
54.6%

Deep learning can extract quantitative measurements from microscopy images that are inaccessible to classical analysis, but developing these models requires machine learning expertise that most imaging scientists do not have. Here we present a framework in which a researcher describes their microscopy problem to a large language model (LLM) agent in under ten minutes of conversation--specifying what they image, what they want to measure, and what success looks like--and the agent autonomously handles the rest: designing physics-based training data, implementing a neural network, training, diagnosing failures, and iterating without human intervention. A researcher can start the agent before leaving the lab; overnight, it tests tens to a hundred model variations, each one an experiment that would otherwise demand active attention. We validate the framework across six microscopy modalities and four problem types. On the BBBC039 nuclear segmentation benchmark, the agent autonomously trains a U-Net with 3-class semantic segmentation and morphological post-processing, achieving pixel-level Dice of 0.97 and object-level F1 of 0.84--within 7% of the published baseline--while diagnosing a data pipeline bug that no amount of hyperparameter tuning could resolve. On single-protein holographic microscopy, the agent reads a published paper, designs a simulator, and develops an optimized model in a single session. On PatchCamelyon histopathology classification, the agent autonomously evolves through four optimization phases--from scratch training through transfer learning and regularization to inference-time ensembling--completing 97 iterations on 262,144 images to reach 89.3% test accuracy and 96.3% AUC, nearly matching the published rotation-equivariant baseline. This framework enables microscopy researchers to use deep learning-based image analysis without machine learning domain knowledge.
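
The overnight iterate-and-diagnose loop described above can be sketched in a few lines. This is a toy stand-in, not the paper's agent: the simulated dataset, the hyperparameter proposal step, and the use of scikit-learn's MLPClassifier are all illustrative assumptions; a real agent would reason about failures rather than sample blindly.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

def simulate_dataset(n=2000, noise=0.3):
    """Stand-in for physics-based training data: two noisy classes."""
    X = rng.normal(size=(n, 16))
    y = (X[:, :4].sum(axis=1) > 0).astype(int)
    X += rng.normal(scale=noise, size=X.shape)
    return train_test_split(X, y, test_size=0.25, random_state=0)

def propose_variant():
    """Toy 'agent' step: sample a fresh hyperparameter variant."""
    return {
        "hidden_layer_sizes": (int(rng.choice([16, 32, 64])),),
        "alpha": 10.0 ** rng.uniform(-5, -2),
        "learning_rate_init": 10.0 ** rng.uniform(-4, -2),
    }

X_tr, X_te, y_tr, y_te = simulate_dataset()
best_score, best_cfg = -np.inf, None
for trial in range(20):                # overnight: tens to ~100 variants
    cfg = propose_variant()
    model = MLPClassifier(max_iter=300, random_state=trial, **cfg)
    model.fit(X_tr, y_tr)
    score = model.score(X_te, y_te)    # evaluation stands in for 'diagnosis'
    if score > best_score:
        best_score, best_cfg = score, cfg
print(f"best accuracy {best_score:.3f} with {best_cfg}")
```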

2
CellPheno: A High-throughput Computational Platform for Quantifying Cellular Resolution Whole Brain Microscopy Images

Wei, Z.; Curtin, I.; Kyere, F. A.; Borland, D.; Yi, H.; Kim, M.; Dere, M.; McCormick, C. M.; Krupa, O.; Shih, Y.-Y. I.; Zylka, M. J.; Stein, J. L.; Wu, G.

2026-03-19 neuroscience 10.64898/2026.03.17.712391 medRxiv
Top 0.1%
44.2%

Advances in tissue clearing and light-sheet microscopy enable cellular resolution whole-brain 3D imaging. However, existing whole-brain quantification tools do not yet meet demands for efficiency, nor do they assess morphometry. Here we present CellPheno, a 3D nuclei instance segmentation framework for high-throughput cellular phenotyping. CellPheno quantifies an entire P4 mouse brain within 15 hours. We showcase whole-brain morphometry, enhanced stitching, and co-localization across multiple cell types in 53 brains.

3
EthoClaw: An Integrated AI Workflow Platform for Automated Analysis in Neuroethology

Chen, K.; Chen, Z.; Zheng, D.; Fang, X.; Liang, J.; Li, Z.; Chen, Y.; Zou, J.; Cai, B.; Chen, S.; Huang, K.

2026-03-27 animal behavior and cognition 10.64898/2026.03.25.714141 medRxiv
Top 0.1%
41.6%

Computational methods have advanced the analysis of animal behavior, yet significant challenges remain in data standardization, analytical reproducibility, and workflow integration. Existing computational solutions often demand extensive programming proficiency or compel users to navigate a highly fragmented ecosystem of disconnected tools for tracking, statistical analysis, and visualization. Here, we present EthoClaw, an open-source, artificial intelligence-driven workflow platform built upon the OpenClaw agentic framework, functioning as a locally deployable AI assistant for behavioral research. EthoClaw provides an integrated computational infrastructure that seamlessly bridges the gap between raw behavioral video acquisitions and publishable scientific results. In this study, we demonstrate the platform's capacity to natively ingest video data via a dual-mode tracking architecture: utilizing ultra-fast image processing for rapid object detection, and leveraging SuperAnimal models for precise, markerless postural tracking. To ensure maximal interoperability, EthoClaw automatically converts various tracking data formats into DeepLabCut-compatible formats, enabling high-throughput phenotyping by generating publication-quality visualizations alongside rigorous multidimensional statistical profiling. Furthermore, the platform incorporates a large language model (LLM)-driven reporting module that dynamically synthesizes analytical documents, ensuring methodological transparency. Through an open field test, we validate the practical usability of EthoClaw while accelerating computational throughput by localizing heavy video processing to circumvent cloud bandwidth bottlenecks. Operating via an omnichannel natural language interface that integrates seamlessly with ubiquitous instant messaging software, EthoClaw democratizes advanced computational behavioral analysis, offering a holistic, highly efficient ecosystem that enforces experimental reproducibility and open science principles.
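
The DeepLabCut-compatible conversion mentioned above is concrete enough to sketch. DeepLabCut stores predictions as a pandas DataFrame with a three-level column MultiIndex (scorer / bodyparts / coords); the sketch below writes generic keypoint tracks into that layout. The array, bodypart names, and scorer string are invented, and writing to HDF5 assumes the pytables extra is installed.

```python
import numpy as np
import pandas as pd

tracks = np.random.rand(100, 3, 3)           # 100 frames x 3 bodyparts x (x, y, conf)
bodyparts = ["nose", "center", "tail_base"]  # invented bodypart names
scorer = "EthoClawTracker"                   # hypothetical scorer label

# DeepLabCut layout: columns indexed by (scorer, bodyparts, coords)
cols = pd.MultiIndex.from_product(
    [[scorer], bodyparts, ["x", "y", "likelihood"]],
    names=["scorer", "bodyparts", "coords"],
)
df = pd.DataFrame(tracks.reshape(len(tracks), -1), columns=cols)

# DLC conventionally stores predictions in HDF5 under this key (needs pytables)
df.to_hdf("tracks_dlc_format.h5", key="df_with_missing", mode="w")
print(df.head(2))
```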

4
DeepBranchAI: A Novel Cascade Workflow Enabling Accessible 3D Branching Network Segmentation

Maltsev, A. V.; Hartnell, L.; Ferrucci, L.

2026-03-29 bioinformatics 10.64898/2026.03.25.714249 medRxiv
Top 0.1%
38.2%

Three-dimensional branching networks exist throughout biological, natural, and man-made systems as pathways through volumetric space. Segmentation is required to correctly reconstruct the networks in whole or in part for analysis. This presents a unique challenge as minor voxel misclassifications can cause sporadic connectivity shifts, whereby connected elements appear to disconnect (false negatives) or even become amplified (false positives). Addressing this topological vulnerability requires the generation of 3D models, since 2D slice-by-slice approaches cannot maintain connectivity across the x, y, and z axes. Yet tracking 3D architecture demands substantially more analytical resources than a 2D strategy, as generating volumetric annotations requires extraordinary amounts of expert time for manual annotation. This creates a fundamental annotation bottleneck: with sparse training data available, deep learning models tend to overfit available volumes and fail to generalize to novel volumes. We present a cascade training workflow that overcomes this bottleneck through a positive feedback loop in which trained models become annotation aids for subsequent volumes. The workflow begins with random forests that generate initial drafts from minimal labels, followed by expert refinement that cycles ever closer to the ground truth. As refined data accumulates, training transitions from 2D to 3D architectures, which systematically expand sparse datasets into comprehensive training sets. The outcome is a 3D nnU-Net model optimized for topology-preserving segmentation. We dub our resulting model DeepBranchAI. Training validation on heavily branching mitochondrial networks, generated by focused ion beam scanning electron microscopy (FIB-SEM, 15 nm voxel resolution), achieved a Dice Similarity Coefficient (DSC) of 0.942 across 5-fold cross-validation. Transfer learning to vascular networks (VESSEL12 dataset, CT volumes, 30,000-fold voxel size difference), training on as little as 10% of target data, achieved 97.05% accuracy against ground truth, validating that learned features represent domain-general topological principles. This workflow reduces annotation time from months to weeks while transforming sparse initial labels into robust training sets. Complete implementation, trained weights, and validation code are provided open source.
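
The first cascade stage, random forests drafting a segmentation from minimal labels, is easy to illustrate. The sketch below is generic, not the authors' code: the volume, the three smoothed-intensity features, and the stand-in "expert" labels are all synthetic placeholders.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
vol = gaussian_filter(rng.random((32, 32, 32)), sigma=2)   # fake 3D volume

# simple per-voxel features: raw intensity plus two smoothing scales
feats = np.stack([vol, gaussian_filter(vol, 1), gaussian_filter(vol, 4)], axis=-1)
flat_feats = feats.reshape(-1, feats.shape[-1])

# sparse "expert" scribbles: 200 labeled voxels out of 32^3
labeled_idx = rng.choice(vol.size, size=200, replace=False)
labels = (vol.reshape(-1)[labeled_idx] > vol.mean()).astype(int)  # stand-in truth

rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(flat_feats[labeled_idx], labels)
draft = rf.predict(flat_feats).reshape(vol.shape)  # draft mask for expert refinement
print("foreground fraction in draft:", round(float(draft.mean()), 3))
```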

5
Enabling high-plex spectral imaging via DNA-barcoded signal tuning and panel optimization

Reinhardt, R.; Straka, T.; Vierdag, W.-M.; Jevdokimenko, K.; Hecht, F.; Pianfetti, E.; Hudelmaier, T.; Lai, H.; Fouquet, W.; Fahrbach, F.; Roberti, M. J.; Kreshuk, A.; Saka, S. K.

2026-03-19 bioengineering 10.64898/2026.03.18.709053 medRxiv
Top 0.1%
37.7%

High-plex spectral imaging has the potential to transform the analysis of spatial organization in cells and tissues, yet its practical implementation remains limited by challenges in panel design, sample preparation, signal balancing, and experimental validation. While cyclic imaging approaches are widely used in spatial omics, spectral imaging across the full fluorescence spectrum and computational unmixing remain underutilized due to these challenges. Here, we present a generalizable framework for high-plex spectral imaging that leverages DNA-barcoded labeling and programmable signal amplification to provide precise control over fluorescence signal composition. Orthogonal DNA barcodes decouple target labeling from fluorophore detection, enabling reversible fluorophore application and systematic panel optimization directly on the same sample. Programmable DNA-based amplification further enables independent and quantitative tuning of fluorescence intensities across targets, overcoming a key limitation of spectral unmixing, namely imbalanced signal contributions in overlapping channels, and thereby improving accuracy and robustness. The framework also supports the generation of experiment-specific ground truth datasets and systematic evaluation of unmixing algorithms, providing a quantitative basis for panel validation and performance assessment. We demonstrate the practical implementation of this framework by developing a panel for simultaneous imaging of 15 subcellular structures without fluidic cycling and using the optimized panel to profile the effects of chemical perturbations on subcellular organization. We quantitatively evaluate panel compilation and provide a rigorous assessment of unmixing performance using both linear and reference-free unmixing methods. Importantly, we leverage foundation models trained on standard fluorescence data for segmentation-free, high-dimensional analysis of spectrally unmixed images without needing large datasets or model retraining. Together, we establish a practical and tunable framework for high-plex spectral imaging that lowers experimental barriers and enables broader adoption of spectral unmixing for biological and biomedical applications.
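
The linear unmixing that the panel balancing is designed to help can be shown with standard per-pixel non-negative least squares. This is a textbook sketch under invented spectra, not the paper's pipeline; note how the imbalanced abundances the abstract mentions enter the problem directly.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(1)
n_channels, n_fluors = 32, 4

# columns = reference emission spectra (measured experimentally in practice)
M = np.abs(rng.normal(size=(n_channels, n_fluors)))
M /= M.sum(axis=0)

true_abund = np.array([5.0, 1.0, 0.2, 3.0])    # imbalanced contributions
pixel = M @ true_abund + rng.normal(scale=0.01, size=n_channels)

est, _ = nnls(M, pixel)                        # per-pixel linear unmixing
print("true:", true_abund, "estimated:", est.round(2))
```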

6
Halo: a pretrained model for whole-cell segmentation from nuclei images in spatial transcriptomics

Zhang, X.; Zhuang, H.; Ji, Z.

2026-04-06 bioinformatics 10.64898/2026.04.02.716237 medRxiv
Top 0.1%
37.5%

Spatial transcriptomics enables measurement of gene expression while preserving spatial organization within tissues. Accurate reconstruction of single-cell transcriptomes requires precise whole-cell segmentation, yet many spatial transcriptomics experiments provide only nuclear staining images, making reliable inference of cell boundaries difficult. Here we introduce Halo, a pretrained segmentation model that reconstructs whole-cell boundaries by integrating nuclear morphology with the spatial distribution of RNA transcripts. Halo converts transcript coordinates into molecular density maps that are processed jointly with DAPI images using a Cellpose-SAM segmentation architecture. Unlike existing approaches that require dataset-specific training, Halo is pretrained on multimodal Xenium data from 12 tissue types and can be directly applied to new datasets without additional training. Across diverse tissues, Halo substantially outperforms the widely used nuclear expansion strategy, achieving higher agreement with ground-truth cell boundaries and more accurate RNA-to-cell assignment. Improved segmentation leads to more reliable cell type identification and more accurate estimation of cell morphological features. By providing a pretrained, generalizable model for whole-cell reconstruction, Halo enables scalable and reproducible cell segmentation for image-based spatial transcriptomics.
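
The input construction Halo uses, molecular density maps alongside DAPI, reduces to a rasterize-and-smooth step that is easy to sketch. The coordinates, image, bin sizes, and smoothing sigma below are all illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)
h = w = 256
dapi = gaussian_filter(rng.random((h, w)), sigma=3)    # fake nuclear stain
xy = rng.uniform(0, h, size=(5000, 2))                 # fake transcript coordinates

# rasterize transcripts into a per-pixel count map, then smooth it
density, _, _ = np.histogram2d(xy[:, 0], xy[:, 1],
                               bins=(h, w), range=[[0, h], [0, w]])
density = gaussian_filter(density, sigma=2)

model_input = np.stack([dapi, density / density.max()])  # 2-channel network input
print(model_input.shape)                                  # (2, 256, 256)
```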

7
NucleoNet and DropNet: Generalist deep learning models for instance segmentation of nuclei and lipid droplets from electron microscopy images

Bhardwaj, A.; Dell, C. W.; Mikolaj, M. R.; Spiers, H.; Harned, A.; Kuppusamy, B.; Liu, P.; Wei, D.; Sterneck, E.; Narayan, K.

2026-04-05 cell biology 10.64898/2026.04.02.713930 medRxiv
Top 0.1%
33.4%

Automating cellular organelle segmentation is key to increasing the throughput in electron microscopy (EM) and volume EM (vEM) workflows. Deep learning (DL) has accelerated this process, but model development has predominantly centered on mitochondria, partly because of a scarcity of suitable training datasets for other features. Here, we crowdsourced the manual step of labeling nuclei and lipid droplets (LDs) from complex cellular EM images and trained Panoptic DeepLab (PDL) models on these large, heterogeneous annotated datasets as well as on publicly available vEM datasets. NucleoNet and DropNet, the resulting instance segmentation models for nuclei and LDs, respectively, yield high-quality results on varied benchmarks. We applied these models to quantify differences between 2D and 3D in vitro cancer models versus in vivo tumors, highlighting a path toward robust quantitation in EM. NucleoNet and DropNet are publicly available on our napari plugin, empanada v1.2, for easy point-and-click segmentation of 2D and 3D cellular EM images.

8
GraphBG: Fast Bayesian Domain Detection via Spectral Graph Convolutions for Multi-slice and Multi-modal Spatial Transcriptomics

Do, V. H.; Tran, T. P. L.; Canzar, S.

2026-03-31 bioinformatics 10.64898/2026.03.28.715026 medRxiv
Top 0.1%
28.1%

Spatial transcriptomics (ST) technologies enable measurement of gene expression with spatial context, offering unprecedented insight into tissue architecture and cellular microenvironments. A fundamental analysis task is the identification of spatial domains, i.e., contiguous regions with distinct molecular profiles. As ST datasets scale to larger tissue areas, multiple slices, and multiple molecular modalities, there is a growing need for clustering methods that are accurate, scalable, and capable of integrating diverse spatial and molecular signals. We present GraphBG, a unified and scalable framework for spatial domain detection in ST data. GraphBG integrates approximate spectral graph convolutions with a variational Bayesian Gaussian mixture model, enabling robust representation learning and clustering of spatially coherent domains. We extend this core model to support multi-slice analysis (GraphBG-MS) through metacell aggregation, batch correction, and joint clustering, and to multi-modal spatial omics data (GraphBG-MM) via modality-specific graph encodings and kernel canonical correlation analysis. Across diverse real and simulated datasets, GraphBG consistently outperforms existing methods in domain coherence, scalability, and biological interpretability. Notably, it accurately clusters over 370,000 cells from 31 MERFISH tissue slices in just 5 minutes and integrates spatial transcriptomic and proteomic data for improved domain resolution. Applying GraphBG-MS to mouse liver ST data, we show that it captures canonical lobular zonation and disease-specific remodeling, highlighting its ability to reveal biologically meaningful tissue organization.
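
The two ingredients named in the abstract, approximate spectral graph convolutions and a variational Bayesian Gaussian mixture, both have standard open-source counterparts, so a generic miniature of the idea looks like this (random data, not the authors' implementation; the neighbor count, propagation depth, and component cap are arbitrary choices):

```python
import numpy as np
from scipy.sparse import diags
from sklearn.neighbors import kneighbors_graph
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
coords = rng.uniform(size=(1000, 2))      # spot positions
X = rng.normal(size=(1000, 30))           # expression features (e.g. PCs)

# symmetric k-NN graph with self-loops, normalized as D^-1/2 A D^-1/2
A = kneighbors_graph(coords, n_neighbors=6, include_self=True)
A = A + A.T
A.data = np.ones_like(A.data)
deg = np.asarray(A.sum(axis=1)).ravel()
S = diags(1 / np.sqrt(deg)) @ A @ diags(1 / np.sqrt(deg))

X_smooth = X.copy()
for _ in range(2):                         # k-hop propagation ~ low-pass filtering
    X_smooth = S @ X_smooth

gmm = BayesianGaussianMixture(n_components=10, covariance_type="diag",
                              max_iter=200, random_state=0)
domains = gmm.fit_predict(X_smooth)        # spatially coherent domain labels
print(np.bincount(domains))
```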

9
A versatile, positive-going voltage indicator that enables accessible two-photon recordings in vivo

McDonald, A. J.; Land, M. A.; Yang, S.; Hakam, N.; Villette, V.; Zhu, J.; Galdamez, M.; Puebla, M. F. d. l.; Lu, X.; Foran, G.; Torne-Srivastava, T.; Campillo, B.; Liu, H.; Dong, X.; Lai, S.; Shorey, M.; Abdallah, H.; Banks, R.; Mamontova, A.; Shan, Y.-Y. Y.; Kroeger, R.; Law, R. G.; Hu, M.; Santos, D. G.; Bradley, J.; Lombardini, A.; Mathieu, B.; Ayon, A.; Natan, R. G.; Yuan, H.; Reimer, J.; Bourdieu, L.; Ji, N.; Zong, W.; St-Pierre, F.

2026-04-10 neuroscience 10.64898/2026.04.07.717088 medRxiv
Top 0.1%
27.9%

Genetically encoded voltage indicators (GEVIs) enable cell-type-specific optical readout of membrane potential, but two-photon (2P) spike detection has been hampered by low signal-to-noise and ultrafast off-kinetics, restricting use to specialized microscopes. We introduce FORCE1s, a green, positive-going GEVI engineered to make robust 2P voltage imaging broadly accessible. FORCE1s brightens from a dark baseline during depolarization, reports spikes with ~100% ΔF/F in awake mice, and displays repolarization kinetics that are tuned for reliable spike detection at sub-kilohertz frame rates. As a result, FORCE1s supports spike-resolved multi-cell recordings on standard resonant-scanning microscopes, and further scales to larger fields of view and neuron counts on advanced modalities. FORCE1s also enables multiplexed voltage-neurotransmitter imaging and extended recordings in freely moving mice using a compact, affordable MEMS-based 2P miniscope. Together, these advances establish FORCE1s as a community-ready tool that democratizes deep-tissue voltage imaging across platforms and experimental contexts.

10
SimpleFold-Turbo: Adaptive Inference Caching Yields 14-fold Acceleration of Flow-Matching Protein Structure Prediction

Taghon, G.

2026-04-10 bioinformatics 10.64898/2026.04.07.714835 medRxiv
Top 0.1%
27.9%

We apply TeaCache, an adaptive caching technique from video diffusion, to SimpleFold's flow-matching protein structure prediction and achieve 9- to 14-fold inference speedups with negligible quality loss. We determine that flow matching's near-linear generative trajectories make consecutive neural-network evaluations highly redundant. At a low redundancy threshold, SimpleFold-Turbo (SF-T) skips ≈93% of forward passes while preserving near-baseline template modeling (TM) scores across 300 structurally diverse CATH domains and all six SimpleFold model sizes (100 million to 3 billion parameters), at compute budgets where log-uniform step-skipping collapses. Speedup scales with model size because caching overhead is constant while per-step cost grows, and a general three-phase skip pattern emerges independent of protein size or fold. SF-T requires no retraining, no weight modification, and no MSA server dependencies. We release SF-T as fully open-source software enabling thousands of structure predictions per hour on commodity hardware.
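
The adaptive caching idea transfers to a toy flow integrator in a few lines: re-run the network only when the accumulated relative change in its input crosses a threshold, otherwise reuse the cached velocity. The "model", step count, and threshold below are invented, and TeaCache's actual skip criterion is more sophisticated than this sketch.

```python
import numpy as np

def velocity_model(x, t):
    """Toy stand-in for the flow-matching network (expensive in reality)."""
    return -x + np.sin(10 * t)

x = np.random.default_rng(0).normal(size=128)
n_steps, dt, threshold = 200, 1.0 / 200, 0.05
cached_v, accum, n_evals = None, 0.0, 0
prev_x = x.copy()

for i in range(n_steps):
    t = i * dt
    # accumulate relative input change since the last real forward pass
    accum += np.linalg.norm(x - prev_x) / (np.linalg.norm(prev_x) + 1e-8)
    if cached_v is None or accum > threshold:
        cached_v = velocity_model(x, t)    # real forward pass
        n_evals += 1
        accum = 0.0
    prev_x = x.copy()
    x = x + dt * cached_v                  # Euler step of the flow ODE
print(f"ran {n_evals}/{n_steps} forward passes "
      f"({100 * (1 - n_evals / n_steps):.0f}% skipped)")
```

The skip fraction printed at the end plays the role of the ≈93% figure above: along near-linear trajectories, most steps can reuse the cache.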

11
Quantitative extrapolation from single-tags (QuEST) immunofluorescence microscopy to derive TCR signalosome stoichiometries in human primary T cells

Fei, P.; Dustin, M. L.

2026-03-31 immunology 10.64898/2026.03.28.715001 medRxiv
Top 0.1%
26.4%

Upon T cell receptor (TCR) engagement, a T cell forms an immunological synapse (IS) with an antigen-presenting cell (APC), which can be mimicked by purified ligands on supported lipid bilayers (SLBs) [1,2]. Microvilli actively scan the surface; upon initial engagement, F-actin-dependent TCR microclusters form, and the central supramolecular activation cluster (cSMAC) sustains TCR signaling in CD8 T cells [3,4]. Although signaling activities within the IS have been observed qualitatively through total internal reflection immunofluorescence microscopy [5-7], the stoichiometric relationships among the components of the TCR signalosome remain unknown. In this study, we employed a two-step approach to quantify the components of the TCR signalosome. First, Jurkat cell lines expressing GFP-tagged proteins on a knockout background were used to calibrate fluorescence intensity (IF) signals against molecular copy numbers, based on measurements of single-tag signals and multiple corrections. In the second step, this calibration was applied to determine the stoichiometries of key TCR signalosome components, including TCR, CD8, CD28, CD45, PD-1, Lck, ZAP-70, LAT, and PLCγ1, across scanning, early activation, and sustained activation states in human primary T cells. We refer to the method as quantitative extrapolation from single-tags (QuEST) immunofluorescence microscopy. Applying QuEST, we were surprised to find that the ZAP-70:TCR ratio in microclusters and the cSMAC was 1:1, far from the potential 10:1 ratio. Nanoscale structures of the TCR signalosome were further captured using direct stochastic optical reconstruction microscopy (dSTORM), confirming that ZAP-70 was strongly co-localized with the TCR. Moreover, we applied QuEST to confirm the presence of T cell intrinsic CD28 recruitment, independent of CD80 or CD86 on SLBs, during TCR activation. This T cell intrinsic CD28 recruitment could be disrupted through engagement of PD-1 with PD-L1 on SLBs, showing that PD-1 engagement can disrupt T cell intrinsic CD28 costimulation. QuEST provides a broadly applicable pipeline for quantitative analysis of TCR signalosomes in human primary cells, enabling a quantitative basis for the rational manipulation and engineering of the TCR signalosome in immunotherapies.
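
The calibration at the heart of QuEST reduces to simple arithmetic once the single-tag unit intensity is measured, which makes a tiny worked example useful. All numbers below are invented, and the paper's multiple corrections are collapsed here into a single factor.

```python
# Hypothetical calibration values, for illustration only
single_tag_intensity = 120.0   # mean IF signal of one GFP tag (a.u.)
background = 40.0              # local background estimate (a.u.)
correction = 0.85              # combined correction factor (e.g. GFP maturation)

def copy_number(cluster_intensity):
    """Corrected cluster intensity divided by the effective single-tag signal."""
    return (cluster_intensity - background) / (single_tag_intensity * correction)

for raw in (1060.0, 5140.0):
    print(f"raw {raw:7.1f} a.u. -> ~{copy_number(raw):.0f} molecules")
```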

12
Naturalistic Stimulus Reconstruction from fMRI: A Primer in the Natural Scenes Dataset

Yildiz, U.; Urgen, B. A.

2026-03-30 neuroscience 10.64898/2026.03.26.714100 medRxiv
Top 0.1%
25.9%

Reconstructing natural images from brain activity represents one of the most compelling demonstrations of the synergy between modern neuroimaging and machine learning. However, the computational pipelines underlying these results remain scarcely accessible, difficult to reproduce, and offer limited opportunities for hands-on experimentation. They depend on large codebases, expensive hardware, and multiple representational stages whose interactions are not obvious. We present a step-by-step tutorial, organized across six notebooks, for reconstructing natural images from fMRI responses in the Natural Scenes Dataset. The workflow walks the reader through three main stages: predicting coarse image structure from brain activity by targeting the latent space of a pretrained image autoencoder, predicting semantic content by targeting learned vision-language embeddings, and combining both signals through a pretrained generative model that produces a final image reflecting both the recovered layout and the recovered meaning. Each notebook explains the reasoning behind its pipeline stage and provides runnable code to reproduce and build on each component. We present qualitative and quantitative metrics for all of our pipeline stages. Every notebook runs end-to-end on free-tier Google Colab hardware, and each stage can be inspected, modified, and replaced independently.
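
Stage one of the pipeline, predicting coarse image structure from brain activity, is typically a regularized linear mapping from voxels to autoencoder latents, which is small enough to sketch. The shapes, ridge penalty, and random arrays below are placeholders for the NSD-derived data the notebooks use.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
voxels = rng.normal(size=(800, 5000))   # trials x voxels (placeholder for NSD betas)
latents = rng.normal(size=(800, 64))    # trials x autoencoder latent dims

Xtr, Xte, ytr, yte = train_test_split(voxels, latents, test_size=0.2, random_state=0)
reg = Ridge(alpha=1e4)                  # heavy shrinkage: far more voxels than trials
reg.fit(Xtr, ytr)
pred = reg.predict(Xte)                 # would be decoded into a coarse image
print("held-out R^2:", round(reg.score(Xte, yte), 3))   # ~0 here: random data
```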

13
Odon: An ultra-fast viewer for spatial proteomics

Coulton, A.; McGranahan, N.

2026-04-01 bioinformatics 10.64898/2026.03.30.715233 medRxiv
Top 0.2%
23.4%

Multiplexed spatial proteomics and spatial transcriptomics generate large, high-dimensional imaging datasets that are challenging to visualize efficiently, particularly at whole-slide and cohort scale. Visualization is an essential step for rapid detection of staining artefacts, such as protein aggregates or non-specific staining. Here, we present Odon, a native Rust desktop viewer designed for rapid, interactive exploration of multiplex imaging data on a standard laptop. Odon is primarily built around the OME-Zarr imaging format, and supports annotations via GeoJSON and GeoParquet, with secondary support for SpatialData, Xenium containers, and TIFF. Data can be stored locally or streamed directly from HTTP or S3-compatible object storage using viewport-driven tile loading. Odon incorporates a highly optimized rendering engine that substantially outperforms existing viewers. In benchmarking, Odon loaded a 32 GB, 36-plex whole-slide OME-Zarr image in under 1 second, compared with 10.14 seconds for QuPath and 35 seconds for Napari. Its GPU-based compositing pipeline also enables smooth rendering and interaction with more than 1,000,000 segmented cells, exceeding the practical limits of many existing tools. Odon further supports integrated visual analytics, including live thresholding and cell selection, and a mosaic mode for simultaneous viewing of hundreds of regions of interest in cohort and tissue microarray studies. Together, these features establish Odon as a high-performance platform for scalable visualization of spatial proteomics data.
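
Odon itself is native Rust, but the viewport-driven tile loading it relies on is easy to demonstrate against an OME-Zarr-style pyramid with the Python zarr library (v2-style API assumed): open the store lazily, pick a resolution level, and slice only the region on screen, so only the chunks it covers are fetched. The in-memory pyramid below is a stand-in for a real local or S3 store.

```python
import numpy as np
import zarr

# tiny fake two-level pyramid in memory so the example is self-contained
root = zarr.group()
root.create_dataset("0", data=np.random.rand(4, 2048, 2048).astype("f4"),
                    chunks=(1, 256, 256))        # level 0: full resolution
root.create_dataset("1", data=np.random.rand(4, 1024, 1024).astype("f4"),
                    chunks=(1, 256, 256))        # level 1: 2x downsampled

def read_viewport(group, level, channel, y0, y1, x0, x1):
    """Slice only the on-screen region; zarr fetches just the chunks it covers."""
    return group[str(level)][channel, y0:y1, x0:x1]

tile = read_viewport(root, level=1, channel=0, y0=256, y1=512, x0=0, x1=256)
print(tile.shape)    # (256, 256): one chunk's worth actually read
```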

14
GAP-MS: Automated validation of gene predictions using integrated mass spectrometry evidence

Abbas, Q.; Wilhelm, M.; Kuster, B.; Frischman, D.

2026-03-19 bioinformatics 10.64898/2026.03.17.712294 medRxiv
Top 0.2%
23.4%

Accurate genome annotation is fundamental to modern biology, yet distinguishing authentic protein-coding sequences from prediction artifacts remains challenging, particularly in complex plant genomes where automated methods are error-prone and manual curation is rarely feasible due to prohibitive time and costs. Here, we present GAP-MS (Gene model Assessment using Peptides from Mass Spectrometry), an automated proteogenomic pipeline that leverages mass spectrometry evidence to systematically validate the protein-level accuracy of predicted gene models. Applied across 9 major crop species, GAP-MS consistently improved prediction precision for four widely used gene prediction tools. In addition to filtering erroneous models, the pipeline identified hundreds of previously missing gene models from current standard reference annotations. These peptide-supported loci were further verified by transcriptional evidence, well-supported functional annotations, and high coding-potential scores. Together, these results demonstrate that direct proteomic evidence provides a robust framework for resolving annotation ambiguities, defining high-confidence reference proteomes, and uncovering overlooked protein-coding genes, while facilitating the identification of sequences that may require further investigation. GAP-MS is freely available as a web interface at https://webclu.bio.wzw.tum.de/gapms/.
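
The core check GAP-MS automates, whether confidently identified peptides land inside a predicted protein, can be illustrated with an in silico tryptic digest. The sequences, peptide set, and length cutoff below are toy values; the real pipeline handles missed cleavages, FDR control, and much more.

```python
import re

def tryptic_peptides(protein, min_len=6):
    """In silico trypsin digest: cut after K/R unless followed by P."""
    return {p for p in re.split(r"(?<=[KR])(?!P)", protein) if len(p) >= min_len}

predicted_models = {                    # toy gene models, not real sequences
    "gene_A.1": "MKTAYIAKQRQISFVKSHFSR",
    "gene_B.1": "MSPILGYWKIKGLVQPTR",
}
observed = {"QISFVK", "ELVISLIVESK"}    # peptides from an MS search engine

for model, seq in predicted_models.items():
    hits = tryptic_peptides(seq) & observed
    status = f"supported by {sorted(hits)}" if hits else "no peptide evidence"
    print(model, status)
```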

15
Generative machine learning unlocks the first proteome-wide image of human cells

Sun, H.; Kahnert, K.; Hansen, J. N.; Leineweber, W. D.; Li, M.; Feng, W.; Ballllosera Navarro, F.; Axelsson, U.; Ouyang, W.; Lundberg, E.

2026-04-02 cell biology 10.64898/2026.03.31.715748 medRxiv
Top 0.2%
23.3%

The spatial organization of proteins within cells governs virtually all cellular functions. Yet, current imaging technologies can simultaneously visualize only tens of proteins, orders of magnitude below the thousands that populate a single human cell. Here, we present ProtiCelli, a deep generative model that simulates microscopy images for 12,800 human proteins from just three cellular landmark stains. Trained on 1.23 million images from the Human Protein Atlas, ProtiCelli outperforms existing methods in reconstruction accuracy and textural fidelity, and generalizes to unseen cell types and drug perturbations absent from training. We demonstrate that ProtiCelli-generated images preserve hierarchical subcellular organization, recapitulate known protein-protein interaction landscapes, and resolve compartment-specific functions of moonlighting proteins at the single-cell level. Remarkably, the model infers drug-induced changes in protein expression and localization from cell morphology alone, predicts cell cycle stage without dedicated cell cycle markers, and enables unsupervised segmentation of subcellular compartments as well as spatial decomposition of gene sets into functional regions. Ultimately, we leverage ProtiCelli to generate Proteome2Cell, an unprecedented dataset of 30.7 million simulated images creating 2,400 "virtual cells" across 12 human cell lines. These proteome-scale images enable the construction of hierarchical single-cell models that distinguish conserved from dynamic protein architectures. Integration of Proteome2Cell into the Human Protein Atlas democratizes the exploration of these "virtual cells". By computationally bridging the experimental scalability gap, ProtiCelli establishes a foundation for spatial virtual cell modeling and opens an avenue for transforming spatial proteomics from cataloging proteins to simulating complete cellular systems.

16
mdBIRCH for Fast, Scalable, Online Clustering of Molecular Dynamics Trajectories

Woody Santos, J. B.; Chen, L.; Miranda Quintana, R. A.

2026-03-19 biophysics 10.1101/2025.11.05.686879 medRxiv
Top 0.2%
22.9%

We present mdBIRCH, an online clustering method that adapts the BIRCH CF-tree to molecular dynamics (MD) data by using a merge test calibrated directly to RMSD. Each arriving frame is routed to the nearest centroid and added only if the post-merge radius computed from the cluster feature remains within a user-supplied threshold. This keeps the average deviation to each cluster centroid bounded as the cluster grows and preserves a simple interpretation of resolution in physical units. We evaluate mdBIRCH on a β-heptapeptide and the HP35 system. We propose two protocols to make threshold selection easier: (a) RMSD-anchored runs that use controlled structural edits to define interpretable operating points and (b) a blind sweep that tracks how cluster count, occupancy, and coverage change with the threshold. In both systems, increasing the threshold reduces the number of clusters, concentrates coverage in high-occupancy states, and broadens within-cluster RMSD distributions. Furthermore, because decisions rely only on cluster summaries, mdBIRCH completely avoids the need for pairwise distance matrices, scales near-linearly with the number of frames on standard hardware, and naturally supports incremental operation. The method offers a practical combination of speed and interpretability for large-scale trajectory analysis.
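
The merge test is fully specified by the BIRCH cluster feature, so it can be written out directly: with N members, linear sum LS, and squared-norm sum SS, the RMS radius after adding a frame is sqrt(SS/N - ||LS/N||^2), and the frame joins only if that stays under the threshold. The sketch below is generic, using a flat list instead of a CF-tree and random vectors instead of MD frames.

```python
import numpy as np

class ClusterFeature:
    """BIRCH-style summary: count, linear sum, and sum of squared norms."""
    def __init__(self, frame):
        self.n, self.ls, self.ss = 1, frame.copy(), float(frame @ frame)

    def centroid(self):
        return self.ls / self.n

    def radius_if_merged(self, frame):
        n, ls = self.n + 1, self.ls + frame
        ss = self.ss + float(frame @ frame)
        return np.sqrt(max(ss / n - float((ls / n) @ (ls / n)), 0.0))

    def merge(self, frame):
        self.n += 1
        self.ls += frame
        self.ss += float(frame @ frame)

rng = np.random.default_rng(0)
threshold = 3.0              # RMS deviation bound, in the data's physical units
clusters = []
for frame in rng.normal(size=(500, 10)):   # random vectors standing in for frames
    best = min(clusters, default=None,
               key=lambda c: float(np.linalg.norm(c.centroid() - frame)))
    if best is not None and best.radius_if_merged(frame) <= threshold:
        best.merge(frame)                       # passes the merge test
    else:
        clusters.append(ClusterFeature(frame))  # otherwise seed a new cluster
print(len(clusters), "clusters at threshold", threshold)
```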

17
Hierarchical X-ray microscopy and mesoscopic diffusion MRI in the same brain reveal the human connectome across scales

Chourrout, M.; Gong, T.; Schalek, R.; Keenlyside, A.; Balbastre, Y.; Karlupia, N.; Gonzales, R. A.; Huszar, I. N.; Wanjau, E.; Brunet, J.; Urban, T.; Dejea, H.; Stansby, D.; Gunalan, K.; Glickman, B.; Gaibor, E. J.; Scherick, J. J.; Bintsi, K.-M.; Mauri, C.; Analoro, C.; Ghosh, S. S.; Bellier, A.; Fischl, B. R.; Augustinack, J.; Tafforeau, P.; Maffei, C.; Lee, P. D.; Lichtman, J. W.; Yendiki, A.; Walsh, C. L.

2026-04-06 neuroscience 10.64898/2026.04.02.716198 medRxiv
Top 0.2%
22.7%

We present a multimodal pipeline for 3D imaging of cerebral white-matter architecture across scales, from whole-brain axonal projections down to individual myelinated axons. After diffusion MRI, an adult ex vivo human hemisphere undergoes label-free imaging with Hierarchical Phase-Contrast Tomography (HiP-CT) from 20 µm/voxel in the whole hemisphere to 2 µm/voxel in areas of interest, with intrinsic cross-scale alignment. A 4 cm tissue block extracted from the hemisphere is reimaged with HiP-CT at 0.857 µm/voxel, enabling direct visualisation of single myelinated axons. After osmium staining, micro-CT at 0.364 µm/voxel and electron microscopy at 4 nm/voxel are acquired in biopsies from the tissue block to validate the presence of myelinated axons in the label-free HiP-CT contrast. Spanning three orders of magnitude in resolution, these co-registered multimodal datasets bridge microscopic wiring and macroscopic brain organisation, providing a foundation for anatomically grounded whole-brain connectomics.

18
Combining brain-wide activity imaging with electron microscopy reveals a distributed nociceptive network in the brain

Randel, N.; Wang, C.; Clayton, M. S.; Wang, K.; Pang, S.; Xu, S. C.; Champion, A.; Hess, H. F.; Cardona, A.; Keller, P. J.; Zlatic, M.

2026-03-19 neuroscience 10.1101/2025.09.25.678485 medRxiv
Top 0.2%
22.6%

To understand how brains work, it is necessary to connect neural activity to synaptic-resolution circuit architecture. Recent advances in light-sheet microscopy (LSM) enable whole-brain, cellular-resolution imaging of the activity of all neuronal cell bodies; however, most neurons from such datasets cannot be identified. In most organisms, neurons are identifiable based on their projections (not their cell body positions), which, when densely labelled, cannot be resolved using LSM. Here, we present a novel methodology to overcome this by combining whole-brain activity imaging with subsequent volume electron microscopy imaging of the same brain to visualise neuronal projections and identify neurons with interesting activity. We used this approach to identify brain neurons involved in nociception in the Drosophila larva. After whole-brain imaging of activity during nociceptive stimulation, we imaged the same brain with an enhanced focused ion-beam scanning electron microscope (eFIB-SEM). We registered the functional and anatomical volumes and reconstructed (in the eFIB-SEM volume) the projections of neurons that responded to nociceptive stimulation to determine their developmental lineage and identity. This revealed a distributed nociceptive network spanning 25 distinct lineages and many distinct brain areas, which included direct brain targets of nociceptive projection neurons that integrate nociceptive information with other sensory modalities, as well as brain output neurons (descending neurons [DNs]) that likely contribute to action selection. Our workflow provides a powerful framework for mapping neuronal activity onto structure across an entire brain, yielding novel insights into the distributed central processing of noxious stimuli.

19
Allos: an integrated Python toolkit for isoform-level single-cell and spatial in-situ transcriptomics

Mcandrew, E.; Diamant, A.; Vassaux, G.; BARBRY, P.; Lebrigand, K.

2026-03-26 bioinformatics 10.64898/2026.03.24.713944 medRxiv
Top 0.2%
22.5%

Single-cell RNA sequencing and spatial transcriptomics have transformed our understanding of the transcriptional landscape by enabling high-resolution profiling of gene expression. Yet most experimental pipelines and their associated analysis frameworks collapse transcript diversity into gene-level counts, obscuring alternative splicing and isoform usage. The increasing ability of long-read sequencing to recover full-length transcripts from single cells and spatially barcoded tissues has created a pressing need for computational frameworks to support the storage, analysis, and visualisation of isoform-resolved data. Existing tools for isoform and splicing analysis either specialise in bulk, single-cell, or spatial RNA-seq assays in isolation and remain fragmented across languages and data models, limiting interoperability and hindering widespread adoption. We present Allos, a Python framework for isoform-level single-cell and spatial transcriptomics analysis. Built on the AnnData data model, Allos natively represents transcript-level quantification and integrates directly with GTF/GFF and FASTA annotations. Allos enables differential isoform usage screening, multi-panel visualisation, structural transcript interpretation, and protein-level analysis across bulk, single-cell, and spatial assays from both long- and short-read sequencing. Its modular design and scverse compatibility allow isoform-resolved analyses to run alongside established gene-level workflows, linking transcript-level screening with structure-aware visualisation and downstream interpretation. Allos is open-source and available at https://github.com/cobioda/allos, with comprehensive documentation and tutorials provided online.
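
The data-model choice described above, transcript-level counts in AnnData with a var column mapping transcripts to genes, can be sketched with a toy isoform-usage computation. Transcript names, counts, and the usage layer below are invented for illustration; the actual package layers differential testing and visualization on top of this layout.

```python
import numpy as np
import pandas as pd
import anndata as ad

counts = np.random.default_rng(0).poisson(2, size=(5, 4)).astype(float)
var = pd.DataFrame({"gene": ["G1", "G1", "G2", "G2"]},
                   index=["G1-201", "G1-202", "G2-201", "G2-202"])
adata = ad.AnnData(X=counts, var=var)   # cells x transcripts, gene map in .var

# per-cell gene totals, broadcast back to each transcript of that gene
gene_totals = (pd.DataFrame(adata.X, columns=adata.var_names)
               .T.groupby(adata.var["gene"]).transform("sum").T)
usage = adata.X / np.maximum(gene_totals.to_numpy(), 1)  # isoform usage fractions
adata.layers["isoform_usage"] = usage
print(adata.layers["isoform_usage"].round(2))
```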

20
VesSynth: Tubes Are All You Need for Robust Cross-Scale Cross-Modal 3D Vessel Segmentation

Mauri, C.; Mckenzie, A.; Analoro, C.; Yeon, E.; Coviello, R.; Mora, J.; Chollet, E.; Deden Binder, L.; Mahar, A.; Lin, S.; Benlahcen, M.; Ream, A.; Jama, A.; Garcia, I.; Tran, N.; Onta, P.; Wood, S.; Willis, A.; Mahmood, A.; Sinoballa, G.; Malki, A.; Tran, K.; Malireddy, V.; Onumajuru, N.; Lakshmanan, S.; Hercules Landaverde, K.; Sidow, R.; Wood, D.; Nguyen, B.; Hernandez, J.; Bernier, M.; Hunter, J.; Malki, A.; Tum, A.; Chavez, V.; Shahu, Z.; Vasi, I.; Visser, A.; Ghaouta, Z.; Bond, F.; Vigneshwaran, R.; Kirkpatrick, E.; Avalos Barbosa, M.; Rauh, K.; Herisse, R.; Garcia Pallares, E.; Zeng, X.

2026-04-06 bioengineering 10.64898/2026.04.01.715909 medRxiv
Top 0.2%
22.5%

The cerebral vasculature is central to brain function, with alterations linked to numerous cerebrovascular and neurological disorders. Yet no single imaging modality can capture the entire cerebral vascular network in humans. Instead, an array of techniques is sensitized to different spatial scales, each trading off resolution for coverage. Magnetic Resonance Imaging (MRI) typically resolves only large pial vessels, while high-resolution microscopy allows micrometer-scale vessels to be mapped over limited spatial extents. These techniques must therefore be combined to obtain a complete mapping of the cerebral angioarchitecture, which underscores the need for automatic, cross-modal vessel segmentation. Here, we introduce VesSynth, a flexible vessel segmentation framework that achieves state-of-the-art accuracy across multiple modalities and spatial resolutions (MR, optical, and X-ray imaging), despite being trained entirely on synthetic data. By enabling consistent vascular mapping across scales, this framework paves the way to comprehensive investigation of cerebrovascular organization and its role in health and disease.
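
"Trained entirely on synthetic data" invites a small illustration of what a tube generator can look like: a random-walk centerline dilated to a radius yields a label mask, and a blurred, noisy rendering of it yields the training image. This is a toy in the spirit of the idea, not the authors' synthesis engine, and every parameter below is an arbitrary choice.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt, gaussian_filter

def synth_tube_volume(shape=(64, 64, 64), n_steps=300, radius=2.0, seed=0):
    rng = np.random.default_rng(seed)
    pos = np.array(shape, dtype=float) / 2
    centerline = np.zeros(shape, dtype=bool)
    for _ in range(n_steps):                  # meandering random-walk centerline
        pos = np.clip(pos + rng.normal(scale=1.0, size=3), 0, np.array(shape) - 1)
        centerline[tuple(pos.astype(int))] = True
    mask = distance_transform_edt(~centerline) <= radius   # dilate to a tube
    image = gaussian_filter(mask.astype(float), 1) + rng.normal(0, 0.05, shape)
    return image, mask                        # (training input, training label)

image, mask = synth_tube_volume()
print(image.shape, int(mask.sum()), "vessel voxels")
```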